
The Stata do file code_zhouetal_2019.do describes the code flow.

Stata:
1. The datasets 'addSpilloverVariable_modified2.dta' and 'addYearSpecificSpillover4.dta' are the input data for running code_zhouetal_2019.do.
Matlab:
1. The dataset 'ParticipationWithSpillover_addEstiProb.mat' is the input data for code ReviseEstimateProbab3.m. The output data from ReviseEstimateProbab3.m is used as input data for code deriveaddYearSpecificSpillover4.m.
2. The output data from deriveaddYearSpecificSpillover4.m is used as input data for code deriveProgramEffect5.m.
3. The output data from deriveProgramEffect5.m is used as input data for code deriveReductionPercentage6_2.m.
(Note: the dataset 'addSpilloverVariable_modified2.dta' is the original data. 'addYearSpecificSpillover4.dta' and 'ParticipationWithSpillover_addEstiProb.mat' are intermediate output data; they are also attached here for replication purposes.)
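
The Matlab steps above can be chained in a single driver, sketched below. This is only a sketch and is not part of the original package: the file name run_matlab_pipeline.m and the helper runStep are hypothetical, and it assumes that each .m file is a script (not a function requiring arguments) that reads its input from, and saves its output to, the current folder.

```matlab
function run_matlab_pipeline()
% RUN_MATLAB_PIPELINE  Hypothetical driver (not in the original package).
% Runs the four Matlab steps in the order given in this README.
% Assumption: each .m file is a script that loads its input data from,
% and saves its output data to, the current folder.

inputFile = 'ParticipationWithSpillover_addEstiProb.mat';
steps = {'ReviseEstimateProbab3.m', ...
         'deriveaddYearSpecificSpillover4.m', ...
         'deriveProgramEffect5.m', ...
         'deriveReductionPercentage6_2.m'};

% Stop early if the starting dataset is missing.
if exist(inputFile, 'file') ~= 2
    error('Input data %s not found in %s.', inputFile, pwd);
end

for k = 1:numel(steps)
    if exist(steps{k}, 'file') ~= 2
        error('Script %s not found in %s.', steps{k}, pwd);
    end
    fprintf('Running step %d of %d: %s\n', k, numel(steps), steps{k});
    runStep(steps{k});
end
end

function runStep(scriptFile)
% Run one script inside its own function workspace, so that a "clear"
% at the top of the script cannot wipe the loop variables of the driver.
run(scriptFile);
end
```

Because each step runs in a separate workspace, this only works if the steps pass data to one another through saved files, as the list above suggests. If a step relies on variables left in memory by the previous step, the scripts should be run one after another in the base workspace instead.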


Data source: 
The data come from the EPA's Toxics Release Inventory (TRI). Information on facility-specific participation status for all 33/50 participating parent firms was obtained from Hampshire Research. Specifically, the TRI contains chemical- and facility-specific information on toxic releases to different environmental media, from which we calculate the 33/50 releases, the HAP releases and the total TRI releases per facility. It also provides SIC codes, the names and Dun and Bradstreet (D&B) numbers of parent companies, and facility locations. The facilities are identified as belonging to 4,123 parent companies using the parent company names and D&B numbers reported in the TRI; 1,203 of these parent companies participated in the 33/50 program. We use company names reported in (33/50 Program Office, 1991; U.S. Environmental Protection Agency, 1992) to identify the parent companies that were in the first invitation group, and use that information to determine the facilities that belonged to these firms. The EPA reports that 517 firms were invited first (in March 1991) to participate in the program. We obtain the list of 33/50 participating companies and the participation status of each of their facilities through personal communication with Hampshire Research. Facility-specific data on the numbers of violations, penalties and inspections for compliance with mandatory air regulations are obtained from the EPA's AIRS Facility Subsystem (AFS) database (U.S. Environmental Protection Agency, 2008). The facility location reported in the TRI dataset was used to merge the above data with county per capita income from the BEA (Bureau of Economic Analysis). We obtain county attainment status from the EPA's Nonattainment Areas for Criteria Pollutants (Green Book).
 

